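Reach-avoid optimal control problems, in which the system must reach certain goal conditions while staying clear of unacceptable failure modes, are central to safety and liveness assurance for autonomous robotic systems, but their exact solutions are intractable for complex dynamics and environments. Recent successes of reinforcement learning methods in approximately solving optimal control problems with performance objectives make their application to certification problems attractive; however, the Lagrange-type objective used in reinforcement learning is not suited to encoding temporal logic requirements. Recent work has shown promise in extending the reinforcement learning machinery to safety-type problems, whose objective is not a sum but a minimum (or maximum) over time. In this work, we generalize the reinforcement learning formulation to handle all optimal control problems in the reach-avoid category. We derive a time-discounted reach-avoid Bellman backup with contraction mapping properties and prove that the resulting reach-avoid Q-learning algorithm converges under conditions analogous to those of the traditional Lagrange-type problem, yielding an arbitrarily tight conservative approximation of the reach-avoid set. We further demonstrate the use of this formulation with deep reinforcement learning methods, retaining zero-violation guarantees by treating the approximate solutions as untrusted oracles in a model-predictive supervisory control framework. We evaluate the proposed framework on a range of nonlinear systems, validating the results against analytic and numerical solutions, and through Monte Carlo simulation in previously intractable problems. Our results open the door to a range of learning-based methods for autonomous behavior, with applications across robotics and automation. For code and supplementary material, see https://github.com/saferoboticslab/safett_rl.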
translated by Google Translate
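To make the time-discounted reach-avoid backup more concrete, here is a minimal tabular sketch on a one-dimensional chain of states. The abstract does not state the backup, so its exact form is an assumption here: `l(s) <= 0` marks the target, `g(s) > 0` marks failure, and the value is `(1 - gamma) * max(l, g) + gamma * max(g, min(l, min over actions of V(next state)))`. The chain dynamics, the `+1/-1` margins, and all function names are hypothetical and chosen only for illustration.

```python
def reach_avoid_values(n, target, failure, gamma=0.9, iters=300):
    """Value iteration with an assumed discounted reach-avoid backup.

    States are 0..n-1 on a line; actions move one step left or right
    (clipped at the ends). V(s) <= 0 is read as "the target can be
    reached from s without entering the failure set".
    """
    # Assumed margin functions: l <= 0 inside the target, g > 0 inside failure.
    l = [-1.0 if s in target else 1.0 for s in range(n)]
    g = [1.0 if s in failure else -1.0 for s in range(n)]

    def step(s, a):
        return min(max(s + a, 0), n - 1)

    v = [0.0] * n
    for _ in range(iters):
        # Synchronous update: every state is backed up from the previous table.
        v = [
            (1.0 - gamma) * max(l[s], g[s])
            + gamma * max(g[s], min(l[s], min(v[step(s, a)] for a in (-1, 1))))
            for s in range(n)
        ]
    return v


def reach_avoid_set(n, target, failure, gamma=0.9, iters=300):
    """States whose converged value is non-positive."""
    v = reach_avoid_values(n, target, failure, gamma, iters)
    return {s for s in range(n) if v[s] <= 0.0}


if __name__ == "__main__":
    # Failure at state 3 separates states 0..2 from the target at state 5.
    print(reach_avoid_set(7, target={5}, failure={3}))
```

In this toy setup a state that is `d` steps from the target (with a failure-free path) gets the value `1 - 2 * gamma**d`, so a smaller `gamma` excludes more distant states. This is one way to see the approximation as conservative and as tightening when `gamma` approaches 1.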
Identifying similar network structures is key to capturing graph isomorphisms and learning representations that exploit structural information encoded in graph data. This work shows that ego-networks can produce a structural encoding scheme for arbitrary graphs with greater expressivity than the Weisfeiler-Lehman (1-WL) test. We introduce IGEL, a preprocessing step that produces features to augment node representations by encoding ego-networks into sparse vectors, enriching Message Passing (MP) Graph Neural Networks (GNNs) beyond 1-WL expressivity. We formally describe the relation between IGEL and 1-WL, and characterize its expressive power and limitations. Experiments show that IGEL matches the empirical expressivity of state-of-the-art methods on isomorphism detection while improving performance on seven GNN architectures.
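The idea of encoding ego-networks into sparse vectors can be illustrated with a small sketch. The abstract does not give IGEL's exact encoding, so the scheme below is an assumption: for each node, it counts `(distance to root, degree inside the ego-network)` pairs within a fixed radius. The function names and the adjacency-dict graph format are hypothetical.

```python
from collections import Counter, deque


def ego_encoding(adj, root, radius):
    """Count (distance, in-ego-network degree) pairs around `root`.

    `adj` maps each node to a list of neighbours. This is an assumed,
    simplified encoding in the spirit of ego-network features, not
    necessarily the one used by IGEL.
    """
    dist = {root: 0}
    queue = deque([root])
    while queue:  # breadth-first search limited to `radius` hops
        u = queue.popleft()
        if dist[u] == radius:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    counts = Counter()
    for u, d in dist.items():
        degree_inside = sum(1 for w in adj[u] if w in dist)
        counts[(d, degree_inside)] += 1
    return counts


def graph_signature(adj, radius):
    """Sorted multiset of node encodings, comparable across graphs."""
    return sorted(
        tuple(sorted(ego_encoding(adj, v, radius).items())) for v in adj
    )


if __name__ == "__main__":
    # Two disjoint triangles vs. one hexagon: both are 2-regular on six nodes,
    # a standard pair that 1-WL colour refinement cannot tell apart.
    triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
    hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    print(graph_signature(triangles, 1) != graph_signature(hexagon, 1))
```

With radius 1, the neighbours of a triangle node are adjacent to each other (degree 2 inside the ego-network), while the neighbours of a hexagon node are not (degree 1), so the two signatures differ.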
Non-additive measures, also known as fuzzy measures, capacities, and monotonic games, are increasingly used in different fields. Applications have been built within computer science and artificial intelligence, e.g. in decision making, image processing, and machine learning for both classification and regression. Tools for measure identification have also been built. In short, because non-additive measures are more general than additive ones (i.e., probabilities), they have better modeling capabilities and can represent situations and problems that additive measures cannot; see, e.g., the application of non-additive measures and the Choquet integral to model both the Ellsberg paradox and the Allais paradox. As a result, there is an increasing need to analyze non-additive measures, and the need for distances and similarities to compare them is no exception. Some work has been done on defining $f$-divergences for them. In this work we tackle the problem of defining the optimal transport problem for non-additive measures. Distances between pairs of probability distributions based on optimal transport are widely used in practical applications, and their mathematical properties are being studied extensively. We consider it necessary to provide appropriate definitions for non-additive measures that have a similar flavour and generalize the standard ones. We provide definitions based on the M\"obius transform, and also definitions based on the $(\max, +)$-transform, which we consider to have some advantages. We discuss the problems that arise when defining the transport problem for non-additive measures and ways to solve them, provide definitions of the optimal transport problem, and prove some properties.
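As background for the M\"obius-based definitions, the following sketch computes the standard M\"obius transform of a set function, $m(A) = \sum_{B \subseteq A} (-1)^{|A \setminus B|} \mu(B)$, and its inverse $\mu(A) = \sum_{B \subseteq A} m(B)$. The example measure and the function names are hypothetical. The sketch does not implement the transport problem defined in the paper.

```python
from itertools import combinations


def subsets(s):
    """Yield every subset of `s` as a frozenset."""
    items = sorted(s)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def mobius(mu):
    """Mobius transform of a set function given on all subsets of a universe."""
    return {
        a: sum((-1) ** len(a - b) * mu[b] for b in subsets(a))
        for a in mu
    }


def inverse_mobius(m):
    """Recover the measure by summing the Mobius masses of all subsets."""
    return {a: sum(m[b] for b in subsets(a)) for a in m}


if __name__ == "__main__":
    # Hypothetical non-additive measure on {a, b}: mu({a}) + mu({b}) < mu({a, b}).
    mu = {
        frozenset(): 0.0,
        frozenset("a"): 0.2,
        frozenset("b"): 0.3,
        frozenset("ab"): 1.0,
    }
    m = mobius(mu)
    print(m[frozenset("ab")])  # mass not explained by the singletons
```

For an additive measure the M\"obius transform vanishes on every set with more than one element, so non-zero mass on larger sets is what captures the interaction between elements.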
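In this paper, we present a novel offline method for finding suitable parameters of a variable impedance controller (VIC) using the Learning from Demonstration (LfD) paradigm, fulfilling stability and performance constraints while taking into account the user's intuition about the task. Given a compliance profile obtained from human demonstrations, a Linear Parameter Varying (LPV) description of the VIC is derived, which allows the design problem, including stability and performance constraints, to be stated as Linear Matrix Inequalities (LMIs). Then, using a solution-search method, we find the optimal solution according to user preferences on the task behavior. The design problem is validated by comparing the execution of the obtained controller with solutions designed for different sets of user preferences in a two-dimensional trajectory-tracking task. A pulley looping task is presented as a case study to evaluate the performance of the variable impedance controller against a constant-impedance controller, and to assess its agility and leanness using the user-preference mechanism. All experiments were performed using a 7-DoF Kinova Gen3 manipulator.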
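A fuzzy rule-based system (FRBS) is a rule-based system that uses linguistic fuzzy variables as antecedents, and thus represents human-understandable knowledge. FRBSs have been applied to a wide variety of applications and areas throughout the literature. However, FRBSs suffer from many drawbacks, such as uncertainty representation, a large number of rules, loss of interpretability, and high computational time for learning. To overcome these issues, many extensions of FRBSs exist. In this paper, we present an overview and literature review, covering the years 2010-2021, of various types and prominent areas of fuzzy systems (FRBSs), namely genetic fuzzy systems (GFS), hierarchical fuzzy systems (HFS), neuro-fuzzy systems (NFS), evolving fuzzy systems (EFS), FRBSs for big data, FRBSs for imbalanced data, and FRBSs that use cluster centroids as fuzzy rules. GFS use genetic/evolutionary approaches to improve the learning ability of FRBSs, HFS address the curse of dimensionality in FRBSs, NFS use neural networks to improve the approximation ability of FRBSs, and EFS consider dynamic systems. FRBSs are seen as good solutions for big data and imbalanced data. In recent years, interpretability in FRBSs has gained popularity because of high-dimensional and big data, and cluster centroids are used as rules to limit the number of rules in an FRBS. This paper also highlights important contributions, publication statistics, and current trends in the field, and addresses several open research areas that need further attention from the FRBS research community.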
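As a reminder of what a single FRBS looks like, the sketch below evaluates three rules with linguistic antecedents ("cold", "warm", "hot") and crisp outputs, combined by a firing-strength-weighted average (a zero-order Takagi-Sugeno style system). The variables, membership parameters, and rule outputs are hypothetical and not taken from the survey.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with support (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical rule base: IF temperature IS <term> THEN fan speed = <output>.
RULES = [
    ("cold", (0.0, 10.0, 20.0), 0.0),
    ("warm", (10.0, 20.0, 30.0), 50.0),
    ("hot", (20.0, 30.0, 40.0), 100.0),
]


def infer(temperature):
    """Weighted average of rule outputs; None if no rule fires."""
    weights = [triangular(temperature, *params) for _, params, _ in RULES]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * out for w, (_, _, out) in zip(weights, RULES)) / total


if __name__ == "__main__":
    for t in (15.0, 20.0, 25.0):
        print(t, infer(t))
```

Each additional input variable multiplies the number of term combinations that rules may need to cover, which is the rule-explosion problem that hierarchical and cluster-based variants aim to limit.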
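Width-based search methods have demonstrated state-of-the-art performance in a wide range of testbeds, from classical planning problems to image-based simulators such as Atari games. These methods scale independently of the size of the state space, but exponentially in the problem width. In practice, running the algorithm with a width larger than 1 is computationally intractable, which prevents IW from solving problems of higher width. In this paper, we present a hierarchical algorithm that plans at two levels of abstraction. A high-level planner uses abstract features that are incrementally discovered from low-level pruning decisions. We illustrate this algorithm in classical planning PDDL domains as well as in pixel-based simulator domains. In classical planning, we show how IW(1) at two levels of abstraction can solve problems of width 2. For pixel-based domains, we show how, in combination with a learned policy and a learned value function, the proposed hierarchical IW can outperform current flat IW-based planners in Atari games with sparse rewards.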
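The width limitation discussed above can be seen in a minimal sketch of flat IW(1): a breadth-first search that prunes every generated state that does not make at least one atom true for the first time. The grid domain, its atoms, and all names are hypothetical. The hierarchical algorithm of the paper is not implemented here.

```python
from collections import deque


def iw1(initial, successors, atoms, is_goal):
    """Breadth-first search with IW(1) novelty pruning.

    Returns a list of actions reaching a goal state, or None when the
    goal is not found among the states that survive pruning.
    """
    if is_goal(initial):
        return []
    seen_atoms = set(atoms(initial))
    queue = deque([(initial, [])])
    while queue:
        state, plan = queue.popleft()
        for action, nxt in successors(state):
            new_atoms = set(atoms(nxt)) - seen_atoms
            if not new_atoms:
                continue  # prune: nothing novel at width 1
            seen_atoms |= new_atoms
            if is_goal(nxt):
                return plan + [action]
            queue.append((nxt, plan + [action]))
    return None


# Hypothetical 4x4 grid; a state is (x, y) and its atoms are its coordinates.
SIZE = 4
MOVES = {"right": (1, 0), "up": (0, 1), "left": (-1, 0), "down": (0, -1)}


def grid_successors(state):
    x, y = state
    for name, (dx, dy) in MOVES.items():
        nx, ny = x + dx, y + dy
        if 0 <= nx < SIZE and 0 <= ny < SIZE:
            yield name, (nx, ny)


def grid_atoms(state):
    x, y = state
    return {("x", x), ("y", y)}


if __name__ == "__main__":
    print(iw1((0, 0), grid_successors, grid_atoms, lambda s: s == (3, 0)))
    print(iw1((0, 0), grid_successors, grid_atoms, lambda s: s == (2, 2)))
```

With these atoms, the search only keeps states along the two axes, so a goal such as `(3, 0)` is found while `(2, 2)`, which needs a novel pair of atoms, is not. Handling such cases requires either a larger width or better features, which is the gap the abstract says the two-level approach targets.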